Running head: THE ARC NONWORD DATABASE

(Almost) a Million Nonwords: The ARC Nonword Database

Authors

Kathleen Rastle

Jonathan Harrington

Max Coltheart

Breck Thomas

Abstract

This paper documents a Web-based psycholinguistic resource, the ARC Nonword Database, which complements the existing MRC Psycholinguistic Database (which provides words only). Pseudohomophones and non-pseudohomophonic nonwords were devised based on phonotactic and orthographic constraints of Australian English. These items can be selected from the ARC Nonword Database on the basis of a wide variety of properties known or suspected to be of theoretical importance for the investigation of reading. We hope that this database will facilitate such investigations.

(Almost) a Million Nonwords: The ARC Nonword Database

Much research in the experimental psychology of reading investigates relationships between various psycholinguistic properties of words and indices of reading behaviour such as naming latency or visual lexical decision. Such work requires careful selection of words so that they vary appropriately on the dimension of interest and are matched appropriately on other, potentially confounding, variables. The MRC Psycholinguistic Database (Coltheart, 1981) was developed as a tool to facilitate such word selection. Words can be selected from that database on a large variety of psycholinguistic criteria. A particularly convenient Web version of this database is now browsable at http://www.psy.uwa.edu.au/uwa_mrc.htm.

That database contains only words, however, which is a limitation, since the study of nonword reading has also proven useful in understanding the mechanisms that underlie visual word recognition and word reading aloud, in large part because nonword reading and word reading seem to involve common processes. For example, neighbourhood size (N) facilitates both nonword reading aloud (e.g., McCann & Besner, 1987; Peereman & Content, 1995) and word reading aloud (e.g., Andrews, 1992).
Body consistency (Glushko, 1979) and word length (e.g., Weekes, 1997; Rastle & Coltheart, in press) similarly affect both nonword and word reading aloud, though there is a clear interaction between lexicality and word length. Furthermore, pseudohomophones (nonwords that sound identical to words) are named more quickly than non-pseudohomophonic nonwords (e.g., McCann & Besner, 1987; Taft & Russell, 1992) and are rejected more slowly in the lexical decision task than are non-pseudohomophonic nonwords (e.g., Coltheart, Davelaar, Jonasson, & Besner, 1977); these results show that nonword reading is influenced by lexical factors. Finally, nonwords must be chosen carefully when used as filler items, because the pattern of word-reading effects can change substantially not only as a function of whether nonwords are included as fillers (e.g., Tabossi & Laghi, 1992; Baluch & Besner, 1991; Monsell, Patterson, Graham, Hughes, & Milroy, 1992), but also as a function of the characteristics of the nonword fillers (e.g., Pugh, Rexer, Peter, & Katz, 1994).

Thus, as theories of reading become more advanced and more precise, it has become extremely important to select nonword filler and target items systematically, based on properties known to affect nonword reading. To this end, we have created a database of monosyllabic nonwords accessible via a World Wide Web (WWW) interface, which can be used to select nonwords and pseudohomophones on the basis of a variety of psycholinguistic dimensions. The database URL is http://rosella.bhs.mq.edu.au/~nwdb.

The Interface

The interface was modelled after the Web interface for the MRC Psycholinguistic Database referred to above. The nonword database contains 66,711 pseudohomophones and 888,355 non-pseudohomophonic nonwords.
Each of these nonwords is listed with a number of indices reflecting various properties such as number of letters, number of phonemes, number of body friends, number of body enemies, bigram frequency, and trigram frequency. The properties included in the database are shown in Figures 1 and 2.

[Insert Figures 1 and 2 about here]

Users first choose whether to select a set of nonwords or a set of pseudohomophones, and then specify how many items they would like produced by the search. They are then asked to indicate a minimum value, a maximum value, or no value for each of the properties shown in Figure 1. If a value is not entered in a particular field, that property is not considered in the selection. Users are then asked to indicate which properties they would like displayed in the output. This aspect of the interface is shown in Figure 2.

Selection from the database is random in the following sense. Because the databases are so large, randomly reordering each one takes too long to be practical every time a user wishes to select a set of items. Instead, each database was randomly ordered just once, and that random order is fixed. Random selection is achieved for each request by entering the relevant database at a random point (computed independently for each user request) and selecting the first N items encountered from that point that meet the criteria specified by the user (where N is the number of items requested by the user).

Creation of the Database

The nonword database was created by first generating 123,407 phonotactically legal monosyllabic strings of Australian English. This list of phonotactically legal strings was divided into two further lists, one containing all strings homophonic with an Australian-English word and another containing non-pseudohomophonic nonwords.
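The selection procedure described under The Interface (one fixed random order, a fresh random entry point for each request, then the first N matching items) might be sketched as follows. This is an illustrative Python sketch, not the database's actual implementation: the function names, the toy records, and the choice to wrap around to the start of the list when the scan reaches the end are all assumptions.

```python
import random


def build_fixed_order(items, seed=0):
    """Shuffle the database once; the resulting order is reused for every request."""
    ordered = list(items)
    random.Random(seed).shuffle(ordered)
    return ordered


def select_items(fixed_order, meets_criteria, n, rng=random):
    """Return up to n items that satisfy meets_criteria, scanning from a random entry point."""
    if n <= 0 or not fixed_order:
        return []
    # A new entry point is drawn independently for each request.
    start = rng.randrange(len(fixed_order))
    selected = []
    # Assumption: the scan wraps around to the beginning if it reaches the end.
    for offset in range(len(fixed_order)):
        item = fixed_order[(start + offset) % len(fixed_order)]
        if meets_criteria(item):
            selected.append(item)
            if len(selected) == n:
                break
    return selected


# Hypothetical toy records standing in for database entries.
toy_database = build_fixed_order(
    [{"item": f"nw{i}", "letters": 3 + i % 4} for i in range(50)], seed=1
)
picked = select_items(toy_database, lambda e: 4 <= e["letters"] <= 5, n=5)
```

Because the shuffle is done once, each request costs only a linear scan from the entry point rather than a full reordering of the database.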
To each of the two lists (pseudohomophonic and non-pseudohomophonic strings), we applied a principled phonology-to-orthography mapping based on the statistical occurrence of various phoneme-grapheme correspondences (PGCs) in Australian-English monosyllabic words. All of the properties contained in the database (e.g., N, number of body neighbours) were then calculated for each nonword based on the CELEX English database of monosyllabic word forms (Baayen, Piepenbrock, & van Rijn, 1993). Details of each step in creating the databases of pseudohomophones and nonwords follow.

Phonotactically Legal Strings

All languages restrict the ways in which phonemes can be sequentially arranged in a syllable; these restrictions are known as phonotactic constraints. Although phonotactic constraints vary across languages (e.g., German allows syllables to begin with /kn/ whereas English does not), they are often strongly influenced by the sonority (Jespersen, 1904; Saussure, 1916; Sievers, 1881) and sequential arrangement of segments, which gives rise to a sonority profile (Clements, 1990; Hooper, 1972; Kiparsky, 1981; Lowenstamm, 1981; Rice, 1992; Selkirk, 1984; Zec, 1995; Zwicky, 1972). A segment's sonority can be defined in terms of the extent to which the vocal tract is constricted in speech production (Beckman, Edwards, & Fletcher, 1992; de Jong, 1995). For example, voiceless oral stops and vowels (especially open vowels) are at opposite ends of the sonority scale: in producing /t/, the vocal tract is tightly constricted at a certain stage of production (low sonority), whereas in /a/, the vocal tract is wide open and acoustic energy can radiate from the lips. Loudness is also an acoustic-perceptual correlate of sonority: sounds with high sonority are generally louder than those with low sonority (Lindblom & Sundberg, 1971).
The sonority scale, from least to greatest sonority, can be defined as: oral fricatives, nasal stops, liquids, glides, vowels.
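To make the idea of a sonority profile concrete, the sketch below maps a phoneme sequence onto ranks from the scale and checks whether the profile rises to a single peak and then falls. Several things here are assumptions rather than details taken from the text: the rise-then-fall criterion (the usual sonority sequencing idea, which this excerpt does not spell out), the treatment of oral stops and fricatives as separate classes with oral stops lowest (suggested by the remark that voiceless oral stops and vowels sit at opposite ends of the scale), and the toy phoneme inventory, which is not the database's transcription scheme.

```python
# Sonority ranks, lowest to highest. Placing oral stops below fricatives is an
# assumption based on the remark that oral stops and vowels are at opposite ends.
SONORITY_RANK = {
    "oral_stop": 0,
    "fricative": 1,
    "nasal_stop": 2,
    "liquid": 3,
    "glide": 4,
    "vowel": 5,
}

# Hypothetical toy inventory; the symbols are illustrative only.
PHONEME_CLASS = {
    "p": "oral_stop", "t": "oral_stop", "k": "oral_stop",
    "f": "fricative", "s": "fricative",
    "m": "nasal_stop", "n": "nasal_stop",
    "l": "liquid", "r": "liquid",
    "w": "glide", "j": "glide",
    "a": "vowel", "i": "vowel", "u": "vowel",
}


def sonority_profile(phonemes):
    """Map each phoneme to its sonority rank."""
    return [SONORITY_RANK[PHONEME_CLASS[p]] for p in phonemes]


def has_single_sonority_peak(phonemes):
    """True if sonority strictly rises to one peak and strictly falls after it."""
    profile = sonority_profile(phonemes)
    peak = profile.index(max(profile))
    rising = all(a < b for a, b in zip(profile[:peak], profile[1:peak + 1]))
    falling = all(a > b for a, b in zip(profile[peak:], profile[peak + 1:]))
    return rising and falling


plant_like = has_single_sonority_peak(["p", "l", "a", "n", "t"])  # stop-liquid onset
reversed_onset = has_single_sonority_peak(["l", "p", "a"])        # liquid-stop onset
```

Note that a /kn/ onset also passes this check, which fits the observation that German permits it while English does not: a sonority profile alone does not capture language-specific restrictions.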


Similar resources

Running head: A MULTILINGUAL PSEUDOWORD GENERATOR

Pseudowords play an important role in psycholinguistic experiments, either because they are required in performing tasks such as lexical decision, or because they are the main focus of study, as in nonword reading or nonce inflection tasks. We present a pseudo-word generator that improves on current methods. It allows for the generation of written polysyllabic pseudowords that obey a given lang...


Differences in the nonword repetition performance of children with and without specific language impairment: a meta-analysis.

PURPOSE: This study presents a meta-analysis of the difference in nonword repetition performance between children with and without specific language impairment (SLI). The authors investigated variability in the effect sizes (i.e., the magnitude of the difference between children with and without SLI) across studies and its relation to several factors: type of nonword repetition task, age of SLI ...


RUNNING HEAD: ORTHOGRAPHY AND VOCABULARY ACQUISITION Orthographic facilitation in oral vocabulary acquisition

An experiment investigated whether exposure to orthography facilitates oral vocabulary learning. Fifty-eight typically developing children aged 8-9 years were taught 12 nonwords. Children were trained to associate novel phonological forms with pictures of novel objects. Pictures were used as referents to represent novel word meanings. For half of the nonwords children were additionally exposed ...


Can the First Letter Advantage Be Shaped by Script-specific Characteristics?

We examined whether the first letter advantage that has been reported in the Roman script disappears, or even reverses, depending on the characteristics of the orthography. We chose Thai because it has several "nonaligned" vowels that are written prior to the consonant but phonologically follow it in speech (e.g., แฟน <ɛ:fn> is spoken as /fɛ:n/) whereas other "aligned" vowels are written an...


Single- versus dual-process models of lexical decision performance: insights from response time distributional analysis.

This article evaluates 2 competing models that address the decision-making processes mediating word recognition and lexical decision performance: a hybrid 2-stage model of lexical decision performance and a random-walk model. In 2 experiments, nonword type and word frequency were manipulated across 2 contrasts (pseudohomophone-legal nonword and legal-illegal nonword). When nonwords became more ...



Publication date: 2007
